perm filename TITLE[0,BGB]7 blob sn#078582 filedate 1973-12-23 generic text, type C, neo UTF8
TITLE PAGE - 1.						  MARCH 1974.
draft - draft - draft - draft - draft - draft - draft - draft - draft


               GEOMETRIC MODELING FOR COMPUTER VISION.


                          BRUCE G. BAUMGART


ABSTRACT:

	This thesis is about a computer graphics approach to computer
vision.   The main design idea  is that a 3-D  geometric model of the
physical world is  an expedient bridge  between image processing  and
artificial  intelligence.   This  idea  is  developed into  a  vision
modeling system  consisting of two programs named GEOMED and CRE. The
system is demonstrated solving description, recognition and
verification problems in the context of viewing objects on a turntable.

CONTENTS:

	PART ONE. THEORY.
		 1.0	Introduction.
		 2.0	Computer Vision Theory.
		 3.0	Geometric Modeling Theory.

	PART TWO. PROGRAMMING.
		 4.0	Memory  - data structures.
		 5.0	Process - algorithms.
		 6.0	Control - command languages.

	PART THREE. APPLICATION.
		 7.0	Demonstrated Applications.
		 8.0	Proposed Applications.
		 9.0	Conclusion.

	ADDENDUM.
		10.0	Glossary.
		11.0	References.
		12.0	Appendices.

---------------------------------------------------------------------
This  research  was  supported  in  part  by  the  Advanced  Research
Projects  Agency of  the  Office of  the Secretary  of  Defense under
Contract No. SD-183.

The views and  conclusions contained  in this document  are those  of
the author and should not  be interpreted as necessarily representing
the  official policies, either expressed or  implied, of the Advanced
Research Projects Agency or the United States Government.
TITLE PAGE - 2.						  MARCH 1974.

-1st draft-    GEOMETRIC MODELING FOR COMPUTER VISION.
---------------------------------------------------------------------
                           A DISSERTATION
           SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE
                AND THE COMMITTEE ON GRADUATE STUDIES
                       OF STANFORD UNIVERSITY
             IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
                          FOR THE DEGREE OF
                        DOCTOR OF PHILOSOPHY
---------------------------------------------------------------------
                                 BY
                          BRUCE G. BAUMGART
                             MARCH  1974
---------------------------------------------------------------------
ACKNOWLEDGEMENTS

	John McCarthy     - thesis adviser.
	Jerome A. Feldman - reader.
	Donald E. Knuth   - reader.
	Alan C. Kay	  - reader.

Hans Moravec, Les Earnest, Lou Paul, Russell Taylor, Robert Sproull,
Jef Raskin, Steve Gibson, Arthur Thomas, Bruce Anderson, Yorick Wilks,
Ralph Gorin, Tovar Mock, Lynn Quam,
Tom Gafford, Ted Panofsky, Andy Moorer, Dan Swinehart,
Ron Rivest, Jack Buchanan, Ivan Sutherland, Tom Binford.
---------------------------------------------------------------------
LIST OF ILLUSTRATIONS.

DETAILED TABLE OF CONTENTS.
---------------------------------------------------------------------
1.0 INTRODUCTION.

	This thesis is about a computer graphics approach to computer
vision.   The main design idea  is that a 3-D  geometric model of the
physical world is  an expedient bridge  between image processing  and
artificial intelligence.  Such a  geometric model provides a goal for
image  analysis,  an  origin for image  synthesis (for verification),
and a data structure for spatial problem solving and planning.
	
	The chapters proceed  from the general  to the particular  in
three parts: theory,  programming,  and application.

	In Part I, the theory consists of two essays: chapter 2 is an
essay on vision and chapter 3 is an essay on geometric modeling.  The
vision theory presented is speculative and much  larger in scope than
the  subsequent results;  so although  the  vision theory  guided the
design of programs and applications, I do not wish to claim  that the
vision  demonstrated  comes  close  to confirming  the  theory.    In
particular, if the reader skips chapter 2, the rest of the thesis can
be viewed as a discussion of a 3-D drawing  program for automatically
generating  and altering  polyhedral scene  descriptions by  means of
video input.

	In Part II, two  computer programs named  CRE and GEOMED  are
explained. CRE  is a  solution to  the problem  of finding  intensity
contours  in  a  sequence  of  television  pictures  and  of  linking
corresponding contours from one picture  to the next. The process  is
automatic and  is intended  to run without  human intervention.   The
image  sequence output of  CRE is input  to GEOMED, a  package of 3-D
modeling routines.   In  GEOMED,   the  perceived CRE  images may  be
compared with synthetic images  computed by a hidden line eliminator;
the perceived images may  be used to  generate new polyhedral  object
descriptions; or  the  images may  be used  to solve  for the  camera
locus. The programming discussion is broken into three parts: memory,
process and control in chapters four, five and six respectively.   In
the memory  chapter,  a  small number  of entities are  introduced as
atoms,   a node representation  for each atom is  explained,  and the
assembly of  atoms to  represent further entities  is begun.  Chapter
five, on  process, explains the bulk  of the work, which  has been to
develop  a system  of routines that  do geometric modeling.  Although
most of the  techniques and problems  discussed in chapter  five have
been recognized  as relevant to computer vision,  there has been very
little written about how  the system is integrated. Finally,  chapter
six explains the command languages which define the interface between
the  modeling system and its application.   The command languages are
notable more for concise, comprehensive notation than for human
engineering, and so must be viewed as low level.
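
	As an illustration only, the linking of corresponding contours
from one picture to the next can be sketched as below.  This is not
the CRE implementation: the matching rule (greedy nearest centroid),
the names centroid and link_contours, and the max_dist threshold are
all assumptions introduced for this sketch.  A contour is taken to be
a plain list of (x, y) points.

```python
from math import hypot


def centroid(contour):
    """Return the mean (x, y) of a contour given as a list of points."""
    n = len(contour)
    return (sum(x for x, _ in contour) / n, sum(y for _, y in contour) / n)


def link_contours(prev_frame, next_frame, max_dist=5.0):
    """Pair contours of one picture with contours of the next picture.

    Hypothetical rule: take candidate pairs in order of increasing
    centroid distance, use each contour at most once, and leave
    unlinked any contour whose nearest partner is beyond max_dist.
    Returns a sorted list of (index_in_prev, index_in_next) pairs.
    """
    prev_c = [centroid(c) for c in prev_frame]
    next_c = [centroid(c) for c in next_frame]

    # Every possible pairing, cheapest (closest) first.
    candidates = sorted(
        (hypot(px - nx, py - ny), i, j)
        for i, (px, py) in enumerate(prev_c)
        for j, (nx, ny) in enumerate(next_c)
    )

    used_prev, used_next, links = set(), set(), []
    for dist, i, j in candidates:
        if dist > max_dist:
            break  # all remaining candidates are farther still
        if i in used_prev or j in used_next:
            continue  # one of the two contours is already linked
        used_prev.add(i)
        used_next.add(j)
        links.append((i, j))
    return sorted(links)


# Two square contours that each move one unit to the right and appear
# in the opposite order in the second picture.
frame_a = [[(0, 0), (2, 0), (2, 2), (0, 2)],
           [(10, 10), (12, 10), (12, 12), (10, 12)]]
frame_b = [[(11, 10), (13, 10), (13, 12), (11, 12)],
           [(1, 0), (3, 0), (3, 2), (1, 2)]]

print(link_contours(frame_a, frame_b))  # [(0, 1), (1, 0)]
```

	A centroid test is the simplest rule that makes the idea
concrete; it ignores contour shape, nesting and intensity, any of
which a working system would presumably have to take into account.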

	In Part III, the machinery  of Part II is applied to a number
of vision- and model-related problems.